Computing Join Aggregates over Private Tables
Abstract
We propose a privacy-preserving protocol for computing aggregation queries over the join of private tables. In this problem, several parties wish to share aggregate information over the join of their tables while concealing the underlying details from which that information is derived. The join operation challenges privacy preservation because it requires matching individual records across private tables without letting any non-owning party learn the actual join values or infer anything about the data held by other parties. We solve this problem with a novel private sketching protocol that securely exchanges randomized summary information about the private tables. This protocol (1) conceals individual private values and their distributions from all non-owning parties, (2) works with many general forms of aggregation functions, (3) handles group-by aggregates, and (4) handles roll-up/drill-down operations. Previous work has not provided this level of privacy for such queries.
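To make the idea of "randomized summary information" concrete, the following is a minimal sketch of how two tables can be summarized independently and the join size (a COUNT aggregate) estimated from the summaries alone. It assumes an AMS-style "tug-of-war" sketch, which the abstract does not name, and it omits the secure exchange entirely, so it illustrates only the sketching idea and offers none of the privacy guarantees claimed above. The function names (`sign`, `sketch`, `estimate_join_size`, `exact_join_size`) and the SHA-256-based sign function are illustrative assumptions, not part of the paper.

```python
import hashlib
from collections import Counter


def sign(seed: int, value: str) -> int:
    """Return a pseudo-random +1 or -1 for (seed, value), derived from SHA-256.

    Assumption: a cryptographic hash stands in for the random sign family
    used by a tug-of-war sketch.
    """
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return 1 if digest[0] & 1 else -1


def sketch(join_values, num_counters=256):
    """Summarize one table's join column as a list of signed counters.

    Counter j holds sum over distinct values v of count(v) * sign(j, v).
    """
    counts = Counter(join_values)
    return [
        sum(c * sign(seed, str(v)) for v, c in counts.items())
        for seed in range(num_counters)
    ]


def estimate_join_size(sketch_a, sketch_b):
    """Estimate |A join B| as the average of counter-wise products."""
    products = [a * b for a, b in zip(sketch_a, sketch_b)]
    return sum(products) / len(products)


def exact_join_size(values_a, values_b):
    """Exact join size, computed with full access to both columns."""
    count_a, count_b = Counter(values_a), Counter(values_b)
    return sum(count_a[v] * count_b[v] for v in count_a)


if __name__ == "__main__":
    table_a = ["k1", "k2", "k2", "k3"] * 50
    table_b = ["k2", "k3", "k4"] * 40
    print("exact:   ", exact_join_size(table_a, table_b))
    print("estimate:", estimate_join_size(sketch(table_a), sketch(table_b)))
```

Both parties must use the same sign function and the same number of counters for the counter-wise products to be meaningful. The estimate is approximate, and its error generally shrinks as `num_counters` grows. In this plain form the counters are shared in the clear, which is exactly the step a private protocol would need to protect.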
Similar resources
The Dimension-Join: A New Index for Data Warehouses
There are several auxiliary pre-computed access structures that allow faster answers by reading less base data. Examples are materialized views, join indexes, B-tree and bitmap indexes. This paper proposes dimension-join, a new type of index especially suited for data warehouses. The dimension-join borrows ideas from several concepts. It is a bitmap index, it is a multi-table join and when bein...
An Overview of Cost-based Optimization of Queries with Aggregates
The optimization problem of Select-Project-Join queries has been studied extensively. However, a problem that has until recently received relatively less attention is that of optimizing queries with aggregates. For example, for single block SQL, traditional query processing systems directly implement SQL semantics and defer execution of grouping until all joins in the FROM and WHERE clauses hav...
Efficient Index-based Processing of Join Queries in DHTs
Massively distributed applications require the integration of heterogeneous data from multiple sources. Peer-to-peer (P2P) is one possible network model for these distributed applications and among P2P architectures, distributed hash table (DHT) is well known for its routing performance guarantees. Under a general distributed relational data model, join query operator, an essential component to...
Evaluating Join Performance on Relational Database Systems
The join operator is fundamental in relational database systems. Evaluating join queries on large tables is challenging because records need to be efficiently matched based on a given key. In this work, we analyze join queries in SQL with large tables in which a foreign key may be null, invalid or valid, given a referential integrity constraint. We conduct an extensive join performance evaluati...